Mean-variance optimality for semi-Markov decision processes under first passage criteria
Authors
Abstract
Similar resources
Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes
In this note, we compare two approaches for handling risk-variability features arising in discrete-time Markov decision processes: models with exponential utility functions and mean-variance optimality models. Computational approaches for finding optimal decisions with respect to the optimality criteria mentioned above are presented, and analytical results showing connections between the above op...
Mean-Variance Optimization in Markov Decision Processes
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...
On mean reward variance in semi-Markov processes
As an extension of the discrete-time case, this note investigates the variance of the total cumulative reward for the embedded Markov chain of semi-Markov processes. Under the assumption that the chain is aperiodic and contains a single class of recurrent states, recursive formulae for the variance are obtained, which show that the variance growth rate is asymptotically linear in time. Expression...
Continuous-time Markov decision processes with nth-bias optimality criteria
In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias...
Algorithmic aspects of mean-variance optimization in Markov decision processes
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...
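A minimal sketch, for orientation, of the constrained mean-variance criterion that the two mean-variance abstracts above refer to (standard notation assumed for illustration; the horizon T, reward r, policy class \Pi and variance bound c are not taken from the listed papers):

% cumulative reward under policy \pi over a finite horizon T,
% maximized in the mean subject to a variance constraint
\[
  R^{\pi} = \sum_{t=0}^{T-1} r(s_t, a_t), \qquad
  \max_{\pi \in \Pi} \; \mathbb{E}^{\pi}\!\left[R^{\pi}\right]
  \quad \text{subject to} \quad
  \operatorname{Var}^{\pi}\!\left(R^{\pi}\right) \le c,
\]
% with the variance expanded in the usual way
\[
  \operatorname{Var}^{\pi}\!\left(R^{\pi}\right)
  = \mathbb{E}^{\pi}\!\left[(R^{\pi})^{2}\right]
  - \left(\mathbb{E}^{\pi}\!\left[R^{\pi}\right]\right)^{2}.
\]

Computing a policy that attains this constrained maximum is the problem the abstracts above report as NP-hard in some cases and strongly NP-hard in others.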
Journal
Journal title: Kybernetika
Year: 2017
ISSN: 0023-5954,1805-949X
DOI: 10.14736/kyb-2017-1-0059